Abstract
Acute myeloid leukaemia (AML) is a haematological cancer of the bone marrow, characterised by the accumulation and expansion of immature cells of the myeloid lineage. Standard treatment of childhood AML is chemotherapy, which fails to achieve durable remission in many patients. Personalised medicine, including immunotherapies against surface antigens, can potentially target chemotherapy-resistant cells and achieve long-term remission. Identifying AML blasts and suitable targets for AML therapy is hampered by the heterogeneity and complex clonal and phenotypic composition of the cancer, as well as by its complex evolution as the disease progresses to relapse. Measuring aberrant expression of surface antigens on AML blasts by flow cytometry is an established strategy for monitoring measurable residual disease. Although the likelihood of finding blasts increases as more markers are included in a staining panel, manual interrogation of all possible marker combinations becomes extremely difficult once a panel exceeds 10 markers. We aim to identify malignant cells based on the expression of 23 surface markers that are normally or aberrantly expressed during haematopoiesis, using spectral flow cytometry and machine learning. The goal of this high-dimensional approach is to distinguish leukaemia cells from normal myeloid cells with high confidence, and to place the blasts along the normal myeloid developmental trajectory.
We built a single-cell AML map from 20 paediatric AML patients enrolled in the Children's Oncology Group Phase III trial AAML1031. All patients were treated with standard chemotherapy on Arm A and consented to provide tissue for research. Cryopreserved samples were obtained at three time points: diagnosis, remission, and relapse. We assumed full-remission samples to be healthy and used them to train variational auto-encoders with different hyper-parameters in a cross-validated manner. These models learned to encode the expression of healthy cells in a reduced dimension and to reconstruct their expression from this latent space. We selected the 10 models that best reconstructed the healthy cells and used them to encode and reconstruct cells from non-remission samples. Since the models encode and reconstruct healthy cells well, we expect good reconstruction of the healthy cells in non-remission samples. For malignant cells we expect poor reconstruction, as their expression pattern differs from that of healthy cells. We therefore use the reconstruction error of cells from non-remission samples to classify them as healthy or malignant. To evaluate this approach, we annotated cells with clearly aberrant expression as malignant and mixed them with remission cells that had not been used for model development. This produced multiple synthetic mixtures with known percentages of malignant cells (ranging from 10 to 90% blasts). Using these synthetic mixtures, we evaluated how well malignant cells can be classified based on their mean-squared reconstruction error. We achieved an average area under the receiver operating characteristic curve of 0.9 across all synthetic mixtures (Fig. 1). The latent space of the variational auto-encoder captures the developmental trajectory of myeloid lineage bone marrow cells (Fig. 2), and we can use this to identify where along the developmental trajectory the leukaemia occurs.
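The evaluation logic described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the data are simulated, a linear auto-encoder (PCA) stands in for the variational auto-encoders, and every name and parameter other than the 23-marker panel size (the latent dimension, noise level, aberrant shift, and cell counts) is an assumption chosen purely for illustration. The sketch fits the model on healthy cells only, scores each cell in a synthetic mixture by its mean-squared reconstruction error, and summarises the separation of malignant from healthy cells as an area under the ROC curve.

```python
import numpy as np

N_MARKERS = 23   # panel size stated in the abstract
LATENT_DIM = 5   # hypothetical latent dimension (assumption)

def fit_linear_autoencoder(healthy, latent_dim):
    """Fit PCA on healthy cells; a linear stand-in for the VAE (assumption)."""
    mean = healthy.mean(axis=0)
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_error(cells, mean, components):
    """Encode, decode, and return the per-cell mean-squared error."""
    latent = (cells - mean) @ components.T
    reconstructed = latent @ components + mean
    return ((cells - reconstructed) ** 2).mean(axis=1)

def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Simulated data (assumption): healthy cells lie near a low-dimensional
# subspace; malignant cells carry an extra shift on a few markers.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(LATENT_DIM, N_MARKERS))
aberrant_shift = np.zeros(N_MARKERS)
aberrant_shift[:4] = 3.0

def simulate_cells(n, malignant=False):
    cells = rng.normal(size=(n, LATENT_DIM)) @ mixing
    cells += 0.1 * rng.normal(size=(n, N_MARKERS))
    return cells + aberrant_shift if malignant else cells

# Train on "remission" (healthy) cells only.
mean, components = fit_linear_autoencoder(simulate_cells(2000), LATENT_DIM)

# Score synthetic mixtures with known blast fractions, built from
# held-out healthy cells and simulated malignant cells.
aucs = {}
for blast_fraction in (0.1, 0.5, 0.9):
    n_blasts = int(1000 * blast_fraction)
    mixture = np.vstack([simulate_cells(1000 - n_blasts),
                         simulate_cells(n_blasts, malignant=True)])
    labels = np.r_[np.zeros(1000 - n_blasts, dtype=int),
                   np.ones(n_blasts, dtype=int)]
    errors = reconstruction_error(mixture, mean, components)
    aucs[blast_fraction] = roc_auc(errors, labels)
    print(f"{blast_fraction:.0%} blasts: AUC = {aucs[blast_fraction]:.3f}")
```

Because the simulated malignant cells are cleanly separated from the healthy subspace, the AUC values here will be higher than would be expected on real cytometry data; the sketch only shows how reconstruction error can serve as a malignancy score.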
We aim to use the trajectory assignment to study how the distribution of malignant cells across the myeloid lineage evolves over time, and to investigate differences between diagnosis and relapse. We expect that this work will reveal patterns of evolution from diagnosis to relapse that will inform the development of novel strategies to predict and prevent relapse.
Disclosures
Alberti: Philogen S.p.A.: Current Employment. Becher: Numab: Membership on an entity's Board of Directors or advisory committees.
Author notes
Asterisk with author names denotes non-ASH members.